library(tidyverse)
library(latex2exp)
library(gridExtra)
library(wesanderson)
library(plotly)
Refer to Typographical errors Problem 1.42. Assume that first-order regression model (1.1) is appropriate, with normally distributed independent error terms whose variance is \(\sigma^2 = 16\).
Answer: Using the model, \(Y_i = \beta_0 + \beta_1X_i + \varepsilon_i\), the likelihood function for the six observations is \[ L(\beta_0, \beta_1) = \prod_{i=1}^6 \frac{1}{\sqrt{2\pi \sigma^2}} \exp\left(-\frac{1}{2\sigma^2}(Y_i - \beta_0 - \beta_1X_i)^2\right) = \prod_{i=1}^6 \frac{1}{\sqrt{32\pi}} \exp\left( -\frac{1}{32}(Y_i - \beta_0 - \beta_1X_i)^2\right) \]
Answer: Using the maximum likelihood estimation method, the estimator for \(\beta_0\) is \(\hat{\beta}_0 = b_0\) and for \(\beta_1\) is \(\hat{\beta}_1 = b_1\). Hence,
typo_errors = read.csv('CH01PR42.txt', sep = '', header = FALSE,
col.names = c('y', 'x'),
colClasses = c('numeric', 'numeric'))
lm.fit_manual = function(X, Y){
b1 = sum((X - mean(X))*(Y - mean(Y))) / (sum((X - mean(X))^2))
b0 = mean(Y) - b1*mean(X)
return(c(b0, b1))
}
typo_mle = lm.fit_manual(typo_errors$x, typo_errors$y)
paste('b0 =', round(typo_mle[1], 3))
## [1] "b0 = 1.597"
paste('b1 =', round(typo_mle[2], 3))
## [1] "b1 = 17.852"
Answer: Create a function for the likelihood function that will take in values of \(\beta_0\) and \(\beta_1\).
typo_mle_fun = function(beta_zero, beta_one){
return(prod((32*pi)^(-1/2)*exp(-(1/32)*(typo_errors$y -
beta_zero -
beta_one*typo_errors$x)^2)))
}
Create a data frame storing the \(\beta_0\), \(\beta_1\) and likelihood values.
n = 350
typo_likelihood_matrix = matrix(rep(NA, n^2), nrow = n, ncol = n)
b0_val = seq(-10, 10, length = n)
b1_val = seq(17, 19, length = n)
for(i in 1:n){
row = c()
for(j in b1_val){
row = c(row, typo_mle_fun(b0_val[i], j))
}
typo_likelihood_matrix[i,] = row
}
typo_likelihood_df = data.frame(b0 = b0_val,
typo_likelihood_matrix)
colnames(typo_likelihood_df) = c('b0', b1_val)
typo_likelihood_df = typo_likelihood_df %>%
gather(key = b1, value = likelihood, colnames(typo_likelihood_df)[1:n+1])
Plot the 3D scatterplot.
plot_ly(data = typo_likelihood_df, x = ~b0, y = ~b1, z = ~likelihood,
marker = list(color = ~likelihood, colorscale = "RdBu",
showscale = TRUE)) %>%
add_markers() %>%
layout(scene = list(xaxis = list(title = "beta0"),
yaxis = list(title = "beta1"),
zaxis = list(title = "likelihood")))
By looking at the 3D plot and using the zoom feature, it is apparent that the likelihood is maximized by the maximum likelihood estimates found in part (b).
Using R’s which.max to find the maximum likelihood value, the following \(\beta_0\) and \(\beta_1\) estimates create the maximum likelihood value.
typo_likelihood_df[which.max(typo_likelihood_df$likelihood),]
## b0 b1 likelihood
## 52353 1.575931 17.8538681948424 4.100792e-07
Same result as before with more precision!